TREC 2010 Blog Track: Top Stories Identification

نویسندگان

  • Yeha Lee
  • Woosang Song
  • Hun-Young Jung
  • Vinh Tao Thanh
  • Jong-Hyeok Lee
چکیده

This paper describes our participation in the TREC 2010 Blog Track. For the Top Stories Identification Task, we explore the relationship among news events, news stories and blog posts. We first extract important news events from the TRC2 corpus using a probabilistic mixture model. Then, we propose a probabilistic approach to identify top news stories. Furthermore, we use an additional feature that can be useful in identifying top news stories. For the News Blog Post Ranking Task, we apply the Maximal Marginal Relevance method (MMR) to make the aspects of the blog posts more diverse.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Top Stories Identification From Blog to News in TREC 2010 Blog Track

In 2010 Blog Track, there are two tasks including Faceted Blog Distillation Task and Top Stories Identification Task. We mainly focus on the Top Stories Identification Task. In this task, there are two issues to solve. The first issue is ranking the important news stories on the specified day, named Story Ranking Task. The second issue is named News Blog Post Ranking Task. News Blog Post Rankin...

متن کامل

University of Lugano at TREC 2009 Blog Track

We report on the University of Lugano’s participation in the Blog track of TREC 2009. In particular we describe our system for performing blog distillation, faceted search and top stories identification.

متن کامل

ICTNET at Blog Track TREC

This paper describes our participation in blog track of TREC2009. All runs are submitted for both two task, namely Top stories identification task and faceted blog distillation task. The “FirteX” platform was used to index and retrieval posts. As for top stories identification task, to identify important headlines, we measure the importance of headline by accumulating the BM25 relevance score w...

متن کامل

Overview of the TREC-2010 Blog Track

• Top stories identification: A task that addresses news-related issues on the blogosphere, namely investigating whether the blogosphere can be leveraged to identify the top news stories of a given day in a real-time fashion. The task has also a search diversity flavour, where for a given story, a representative set of blog posts discussing the story from various perspectives [7] is shown to th...

متن کامل

From Blogs to News: Identifying Hot Topics in the Blogosphere

We describe the participation of the University of Amsterdam’s ILPS group in the blog track at TREC 2009. We focus on the top stories identification task, and take an approach that does not require the headlines of top stories to be known beforehand. We explore the feasibility of a so-called blogs to news approach: given a date and a set of blog posts, identify the main topics for that date. Th...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010